Active Transfer Learning
Authors
Abstract
Transfer learning algorithms are used when one has sufficient training data for one supervised learning task (the source task) but only very limited training data for a second task (the target task) that is similar but not identical to the first. These algorithms use varying assumptions about the similarity between the tasks to carry information from the source to the target task. Common assumptions are that only certain specific marginal or conditional distributions have changed while all else remains the same. Moreover, not much work on transfer learning has considered the case where a few labels in the test domain are available. Alternatively, if one has only the target task but also the ability to choose a limited amount of additional training data to collect, then active learning algorithms are used to make choices that will most improve performance on the target task. These algorithms may be combined into active transfer learning, but previous efforts have had to apply the two methods in sequence or use restrictive transfer assumptions.

This thesis focuses on active transfer learning under the model shift assumption. We start by proposing two transfer learning algorithms that allow changes in all marginal and conditional distributions but assume the changes are smooth in order to achieve transfer between the tasks. We then propose an active learning algorithm for the second method, which yields a combined active transfer learning algorithm. By analyzing the risk bounds for the proposed transfer learning algorithms, we show that when the conditional distribution changes, we are able to obtain a generalization error bound of O(1/(λ∗√n_l)) with respect to the labeled target sample size n_l, modified by the smoothness λ∗ of the change across domains. Our analysis also sheds light on conditions under which transfer learning works better than no-transfer learning (learning from labeled target data only). Furthermore, we consider a general case where both the support and the model change across domains, and we transform both X (features) and Y (labels) by a parameterized location-scale shift to achieve transfer between tasks.

Relatedly, multi-task learning attempts to simultaneously leverage data from multiple domains in order to estimate related functions on each domain. Similar to transfer learning, multi-task problems are also solved by imposing some kind of "smooth" relationship among tasks. We study how different smoothness assumptions on task relations affect the upper bounds of algorithms proposed for these problems under different settings. Finally, we propose methods to predict the entire distributions P(Y) and P(Y|X) by transfer, while allowing both marginal and conditional distributions to change, and we extend this framework to multi-source distribution transfer. We demonstrate the effectiveness of our methods on both synthetic examples and real-world applications, including yield estimation on a grape image dataset, predicting air quality for cities from Weibo posts, predicting whether a robot successfully climbs over an obstacle, examination score prediction for schools, and location prediction for taxis. (A minimal illustrative code sketch of this active transfer setting is given after the acknowledgments below.)

Acknowledgments

First and foremost, I would like to express my sincere gratitude to my advisor Jeff Schneider, who has been the biggest help throughout my PhD. His brilliant insights have helped me formulate the problems of this thesis and brainstorm new ideas and exciting algorithms.
I have learnt many things about research from him, including how to organize ideas in a paper, how to design experiments, and how to give a good academic talk. This thesis would not have been possible without his guidance, advice, patience and encouragement.

I would like to thank my thesis committee members Christos Faloutsos, Geoff Gordon and Jerry Zhu for providing great insights and feedback on my thesis. Christos has been very kind and always finds time to talk to me even when he is very busy. Geoff provided great insights on extending my work to classification and helped me clarify many notations and descriptions in my thesis. Jerry has been very helpful in extending my work to text data and in providing me with the air-quality dataset. I feel very fortunate to have them as my committee members. I would also like to thank Professor Barnabás Póczos, Professor Roman Garnett and Professor Artur Dubrawski for very helpful suggestions and collaborations during my PhD.

I am very grateful to many of the faculty members at Carnegie Mellon. Eric Xing's Machine Learning course was my introduction to machine learning at Carnegie Mellon and taught me a great deal about its foundations, including many inspiring machine learning algorithms and the theories behind them. Larry Wasserman's Intermediate Statistics and Statistical Machine Learning are both wonderful courses and have been key to my understanding of the statistical perspective on many machine learning algorithms. Geoff Gordon and Ryan Tibshirani's Convex Optimization course was a great tutorial for developing the efficient optimization techniques used in the algorithms I have proposed.

Further, I want to thank all my colleagues and friends at Carnegie Mellon, especially people from the Auton Lab and the Computer Science Department at CMU. I would like to thank Dougal Sutherland, Yifei Ma, Junier Oliva and Tzu-Kuo Huang for insightful discussions and advice on my research. I would also like to thank all my friends who have provided great support and help during my stay at Carnegie Mellon, to name a few: Nan Li, Junchen Jiang, Guangyu Xia, Zi Yang, Yixin Luo, Lei Li, Lin Xiao, Liu Liu, Yi Zhang, Liang Xiong, Ligia Nistor, Kirthevasan Kandasamy, Madalina Fiterau, Donghan Wang, Yuandong Tian and Brian Coltin. I would also like to thank Prof. Alon Halevy, who was a great mentor during my summer internship at Google Research and a great help in my job search.

Finally, I would like to thank my family, my parents Sisi and Tiangui, for their unconditional love, endless support, and unwavering faith in me. I truly thank them for shaping who I am and for teaching me to be a person who would never lose hope or give up.
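Below is the minimal illustrative sketch referenced in the abstract. It is a toy example built on assumptions, not the thesis's algorithms: it uses synthetic 1-D data, models only an additive (location) correction with a Gaussian process rather than a full location-scale transform, and uses plain uncertainty sampling as the active query rule.

```python
# Minimal illustrative sketch (NOT the thesis's algorithms).  Assumptions: synthetic
# 1-D data, an additive (location-only) correction modeled by a Gaussian process,
# and simple uncertainty sampling as the active query rule.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)

# Source task: plenty of labels.
X_src = rng.uniform(0, 10, size=(200, 1))
y_src = np.sin(X_src).ravel() + 0.1 * rng.standard_normal(200)

# Target task: same inputs, but P(Y|X) shifted by a smooth location-scale change.
X_pool = rng.uniform(0, 10, size=(300, 1))                 # unlabeled target pool
true_target = lambda X: 1.5 * np.sin(X).ravel() + 0.5      # hidden ground truth

# 1) Fit the source model f_src(x).
kernel = 1.0 * RBF(length_scale=1.0) + WhiteKernel(noise_level=0.1)
f_src = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X_src, y_src)

# 2) Model the change g(x) = y_tgt - f_src(x) with its own GP, so the learned
#    correction is smooth across the input space ("smooth model shift").
def fit_offset(X_lab, y_lab):
    resid = y_lab - f_src.predict(X_lab)
    g = GaussianProcessRegressor(kernel=1.0 * RBF(2.0) + WhiteKernel(0.1),
                                 normalize_y=True)
    return g.fit(X_lab, resid)

# 3) Active learning on the target: repeatedly label the pool point where the
#    offset GP is most uncertain (uncertainty sampling).
labeled_idx = list(rng.choice(len(X_pool), size=3, replace=False))
y_labels = list(true_target(X_pool[labeled_idx]) + 0.1 * rng.standard_normal(3))
for _ in range(10):
    g = fit_offset(X_pool[labeled_idx], np.array(y_labels))
    _, std = g.predict(X_pool, return_std=True)
    std[labeled_idx] = -np.inf                             # never re-query a labeled point
    nxt = int(np.argmax(std))
    labeled_idx.append(nxt)
    y_labels.append(true_target(X_pool[[nxt]])[0] + 0.1 * rng.standard_normal())

g = fit_offset(X_pool[labeled_idx], np.array(y_labels))    # final combined model
X_test = np.linspace(0, 10, 100).reshape(-1, 1)
mse = np.mean((f_src.predict(X_test) + g.predict(X_test) - true_target(X_test)) ** 2)
print(f"target-domain MSE with {len(labeled_idx)} target labels: {mse:.3f}")
```

Modeling the residual y − f_src(x) with its own smooth GP encodes the "smooth model shift" idea directly: with only a handful of target labels the correction stays near its prior and predictions fall back on the source model, and as actively chosen labels arrive the correction bends the source model toward the target conditional.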
Similar resources
Testing the Structural Model of Job Characteristics, Organizational Climate and Extra-Organizational Factors on the Transfer of Education with the Role Mediation of Strategies Transfer
The purpose of this study was to investigate the role of job factors, constructive organizational climate and extra-organizational factors on the transfer of learning with the mediating role of learning transfer mechanisms on the consequences of learning. The research method was descriptive-survey and based on structural equations. The statistical population of the study included all managers, ...
Hierarchical Active Transfer Learning
We describe a unified active transfer learning framework called Hierarchical Active Transfer Learning (HATL). HATL exploits cluster structure shared between different data domains to perform transfer learning by imputing labels for unlabeled target data and to generate effective label queries during active learning. The resulting framework is flexible enough to perform not only adaptive transfe...
Joint Transfer and Batch-mode Active Learning
Active learning and transfer learning are two different methodologies that address the common problem of insufficient labels. Transfer learning addresses this problem by using the knowledge gained from a related and already labeled data source, whereas active learning focuses on selecting a small set of informative samples for manual annotation. Recently, there has been much interest in develop...
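As a toy contrast of the two methodologies described above (an illustration with assumed synthetic data, not the cited paper's method): transfer learning reuses the labeled source data directly, while batch-mode active learning spends a small labeling budget on the target points the current model is least certain about.

```python
# Toy contrast of transfer vs. batch-mode active learning (illustration only;
# the synthetic data and the simple pooling strategy are assumptions, not the
# cited paper's method).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

# Source domain: fully labeled.  Target domain: shifted class boundary, unlabeled.
X_src = rng.normal(size=(500, 2))
y_src = (X_src[:, 0] + X_src[:, 1] > 0).astype(int)
X_tgt = rng.normal(size=(400, 2)) + np.array([0.5, -0.5])
y_tgt = (X_tgt[:, 0] + 0.7 * X_tgt[:, 1] > 0.3).astype(int)     # hidden ground truth

# Transfer only: train on the source and apply to the target.
clf = LogisticRegression().fit(X_src, y_src)
print("source-only target accuracy:", (clf.predict(X_tgt) == y_tgt).mean())

# Batch-mode active learning on top: query the 20 target points the source model
# is least sure about, then retrain on source + queried target labels.
proba = clf.predict_proba(X_tgt)[:, 1]
query = np.argsort(np.abs(proba - 0.5))[:20]
clf = LogisticRegression().fit(np.vstack([X_src, X_tgt[query]]),
                               np.concatenate([y_src, y_tgt[query]]))
print("after 20 active queries:", (clf.predict(X_tgt) == y_tgt).mean())
```

Pooling the queried target labels with the source data, as above, treats the two steps essentially in sequence; judging by its title, the paper listed above studies combining them jointly.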
Active Transfer Learning under Model Shift
Transfer learning algorithms are used when one has sufficient training data for one supervised learning task (the source task) but only very limited training data for a second task (the target task) that is similar but not identical to the first. These algorithms use varying assumptions about the similarity between the tasks to carry information from the source to the target task. Common assump...
An Adaptive Congestion Alleviating Protocol for Healthcare Applications in Wireless Body Sensor Networks: Learning Automata Approach
Wireless Body Sensor Networks (WBSNs) involve a convergence of biosensors, wireless communication and network technologies. WBSNs enable real-time healthcare services to users. Wireless sensors can be used to monitor patients' physical conditions and transfer real-time vital signs to the emergency center or individual doctors. Wireless networks are subject to more packet loss and congestion. T...
Publication date: 2016